Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 3456 Updated docs/designs/postgresql-state-store.md based on initial exper… #3457

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

gaffer01
Copy link
Member

…imentation

Make sure you have checked all steps below.

Issue

Tests

  • My PR adds the following tests OR does not need testing for this extremely good reason:
    • Documentation only

Documentation

  • [N/A] In case of new functionality, my PR adds documentation that describes how to use it, or I have linked to a
    separate issue for that below.
  • [N/A] If I have added, removed, or updated any external dependencies used in the project, I have updated the
    NOTICES file to reflect this.

@gaffer01 gaffer01 changed the title Updated docs/designs/postgresql-state-store.md based on initial exper… Issue-3456 Updated docs/designs/postgresql-state-store.md based on initial exper… Oct 10, 2024
@gaffer01 gaffer01 changed the title Issue-3456 Updated docs/designs/postgresql-state-store.md based on initial exper… Issue 3456 Updated docs/designs/postgresql-state-store.md based on initial exper… Oct 10, 2024
item in DynamoDB, with optimistic concurrency control used to add new transactions. To avoid reading the entire
history of transactions when querying or updating the state store, we periodically create snapshots of the state.
To get an up-to-date view of the state store, the latest snapshot is queried and then updated by reading all subsequent
transactions from the DyanmoDB transaction log. This state store enforces a sequential ordering to all the updates to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo at "DyanmoDB".

file references and delete them when a compaction job finishes, we will lose track of what files are in the system
and not be able to garbage collect them. We can avoid this problem as follows. When we add a file and some references,
we also add a dummy file reference, i.e. one where the partiton id is "DUMMY". Normal operations on the state store
ignore these entries. When all non-dummy references to a file have been removed, only the dummy reference will remain.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like all we need from the dummy record is to know that that file exists. If we make a separate table for which files exist, we could avoid needing to set values for fields that don't have any meaning for this record. Is there any downside to having a separate table for that?

tasks running at the same time and they all have a connection to the instance, that will put the instance under load
even if those connections are idle. Each execution of a compaction job should create a PostgreSQLStateStore, do the
checks it needs and then close the connection. We could add a method to the state store implementation that closes
any internal connections or we could create the state store with a connection supplier than provides a connection
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like "than" should be "that".

the idea of asynchronous commits from the transaction log state store, i.e. an SQS queue that triggers a lambda to
perform the updates. However, in this case we do not want it to be a FIFO queue as we want to be able to make
concurrent updates. We can set the maximum concurrency of the lambda to control the number of simultaneous updates to
the state store.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maximum concurrency would be shared between all Sleeper tables. If a larger number of tables were actively updated, we wouldn't necessarily get this effect of controlling the simultaneous updates to a state store, because we'd need to set it high enough for all the tables. It seems unlikely that would cause a problem though.


This has some differences to the rest of Sleeper, which is designed to scale to zero by default. Aurora Serverless v2
does not support scaling to zero. This means there would be some persistent costs unless we explicitly pause the Sleeper
instance and stop the database entirely.
Copy link
Collaborator

@patchwork01 patchwork01 Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep this information about Aurora in the document?

With higher levels of transaction isolation, you can produce the same behaviour as a conditional update in DynamoDB.
If a conflicting update occurs at the same time, this will produce a serialization failure. This would require you to
retry the update. There may be other solutions to this problem, but this may push us towards keeping transactions as
small as possible.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still seems possible this will be a problem. From the manual it sounds like there are cases where page locks are acquired even with serializable isolation level. This could still affect us even when we're certain we never update the same records in multiple places.

I can imagine myself coming back to look at this and not being able to find everything. It might be useful to keep a record of the logic behind this, at least why we use serializable isolation level, what we were worried about, and some detail of how it affects us. Can we bring some of this back?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update description of design of potential PostgreSQL state store
2 participants